Protoforms of Linguistic Database Summaries as a Human Consistent Tool for Using Natural Language in Data Mining
Abstract
We consider linguistic database summaries in the sense of Yager (1982), in the implementable form proposed by Kacprzyk & Yager (2001) and Kacprzyk, Yager & Zadrożny (2000), together with their extensions, as a very general tool for human-consistent summarization of large data sets. For a personnel database, an example of such a summary is “most employees are young and well paid” (with some degree of truth). We advocate the use of the concept of a protoform (prototypical form), strongly promoted by Zadeh and shown by Kacprzyk & Zadrożny (2005) to serve as a general form of a linguistic data summary. We then present an extension of our interactive approach to fuzzy linguistic summaries, based on fuzzy logic and fuzzy database queries with linguistic quantifiers. We show how fuzzy queries are related to linguistic summaries, and that one can introduce a hierarchy of protoforms, or abstract summaries, in the sense of Zadeh’s recent (2002) ideas, which are meant mainly to increase the deduction capabilities of search engines. Finally, we show an implementation for the summarization of Web server logs.
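As a rough illustration of what "with some degree of truth" can mean, the sketch below evaluates two summary shapes using Zadeh's calculus of linguistically quantified propositions, which is the usual basis for Yager-style summaries: "Q y's are S" and "Q R y's are S". It is not the authors' implementation. The membership functions `young`, `well_paid`, and `most`, their breakpoints, and the sample records are all hypothetical choices made for this example.

```python
from typing import Callable, Sequence, Tuple

Employee = Tuple[float, float]  # (age in years, monthly salary); hypothetical schema


def young(e: Employee) -> float:
    """Assumed fuzzy set 'young': 1 up to age 30, falling linearly to 0 at 40."""
    age = e[0]
    if age <= 30:
        return 1.0
    if age >= 40:
        return 0.0
    return (40 - age) / 10


def well_paid(e: Employee) -> float:
    """Assumed fuzzy set 'well paid': 0 up to 3000, rising linearly to 1 at 5000."""
    salary = e[1]
    if salary <= 3000:
        return 0.0
    if salary >= 5000:
        return 1.0
    return (salary - 3000) / 2000


def most(p: float) -> float:
    """Assumed relative quantifier 'most': 0 below 0.3, 1 above 0.8, linear between."""
    if p <= 0.3:
        return 0.0
    if p >= 0.8:
        return 1.0
    return (p - 0.3) / 0.5


def truth_simple(records: Sequence[Employee],
                 summarizer: Callable[[Employee], float],
                 quantifier: Callable[[float], float]) -> float:
    """Truth of 'Q y's are S': quantifier applied to the mean membership in S."""
    if not records:
        raise ValueError("cannot summarize an empty data set")
    mean = sum(summarizer(r) for r in records) / len(records)
    return quantifier(mean)


def truth_qualified(records: Sequence[Employee],
                    qualifier: Callable[[Employee], float],
                    summarizer: Callable[[Employee], float],
                    quantifier: Callable[[float], float]) -> float:
    """Truth of 'Q R y's are S': share of R that is also S, using min as 'and'."""
    denom = sum(qualifier(r) for r in records)
    if denom == 0:
        return 0.0  # no record matches the qualifier at all
    numer = sum(min(qualifier(r), summarizer(r)) for r in records)
    return quantifier(numer / denom)


def young_and_well_paid(e: Employee) -> float:
    """Compound summarizer; min is one common (assumed) choice of t-norm."""
    return min(young(e), well_paid(e))


employees = [(25, 5500), (28, 5000), (35, 4000), (50, 6000)]

# "Most employees are young and well paid"
t1 = truth_simple(employees, young_and_well_paid, most)
# "Most young employees are well paid"
t2 = truth_qualified(employees, young, well_paid, most)
print(t1, t2)
```

The two functions correspond to two levels of abstraction: fixing the quantifier, qualifier, and summarizer gives a concrete summary, while leaving them as parameters gives a more general template of the kind the abstract calls a protoform.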
Similar resources
Linguistic database summaries and their protoforms: towards natural language based knowledge discovery tools
We consider linguistic data(base) summaries in the sense of Yager [Information Sciences 28 (1982) 69–86], exemplified by “most employees are young and well paid” (with some degree of truth added), for a personnel database, as an intuitive, human consistent and natural language based knowledge discovery tool. We present first an extension of the classic Yager’s approach to involve more sophist...
Towards human consistent data driven decision support systems using verbalization of data mining results via linguistic data summaries
We present how the conceptually and numerically simple concept of a fuzzy linguistic database summary can be a very powerful tool for gaining much insight into the essence of data that may be relevant for a business activity. The use of linguistic summaries provides tools for the verbalization of data analysis (mining) results which, in addition to the more commonly used visualization e.g. via ...
Linguistic Database Summaries Using Fuzzy Logic: towards a Human-consistent Data Mining Tool
We discuss an approach to fuzzy linguistic summaries of data (bases) in the sense of Yager, i.e., for instance, if we have a (large) database on employees, and we are interested in a relation between the age and qualifications, then it may be summarized by, say, “most young employees are well qualified”. We present the derivation of such linguistic summaries in the context of Zadeh’s computing ...
Linguistic data summarization: a high scalability through the use of natural language?
We discuss the scalability of data mining tools, understood differently from the usual question of whether a tool retains its intended functionality as the problem size increases. We introduce a new concept of cognitive (perceptual) scalability: whether, as the problem size increases, the method remains fully functional in the sense of being able to provide intuitively appeali...
On Multi-subjectivity in Linguistic Summarization of Relational Databases
We focus on one of the most powerful computing methods for natural-language-driven representation of data, i.e. on Yager’s concept of a linguistic summary of a relational database (1982). In particular, we introduce an original extension of that concept: new forms of linguistic summaries. The new forms are named Multi-Subject linguistic summaries, because they are constructed to handle more tha...